ABSTRACT

Transplant  glomerulopathy  (TG)  is  associated  with  rapid  decline  in  glomerular  filtration  rate  and  poor 
outcome.  We  utilized  low-density  arrays  with  a  novel  probabilistic  analysis  to  characterize  relationships 
between  gene  transcripts  and  the  development  of  TG  in  allograft  recipients.  Retrospective  review 
identified  TG  in  10.8%  of  963  core  biopsies  from  166  patients;  patients  with  stable  function  (SF)  were 
studied  for  comparison.  The  biopsies  were  analyzed  for  expression  of  87  genes  related  to  immune 
function  and  fibrosis  using  real-time  PCR,  and  a  Bayesian  model  was  generated  and  validated  to  predict 
histopathology  based  on  gene  expression.  A  total  of  57  individual  genes  were  increased  in  TG  compared 
with SF biopsies (p<0.05). The Bayesian analysis identified critical relationships between ICAM-1, IL-10,
CCL3, CD86, VCAM-1, MMP-9, MMP-7, and LAMC2 and allograft pathology. Moreover, Bayesian
models predicted TG when derived from either immune function {AUC (95% CI) of 0.875 (0.675-0.999),
p=0.004} or fibrosis {AUC (95% CI) of 0.859 (0.754-0.963), p<0.001} gene networks. Critical pathways
in  the  Bayesian  models  were  also  analyzed  using  the  Fisher  exact  test  and  had  p-values  <  0.005.  This 
study  demonstrates  that  evaluating  quantitative  gene  expression  profiles  with  Bayesian  modeling  can 
identify  significant  transcriptional  associations  that  have  the  potential  to  support  the  diagnostic  capability 
of allograft histology. This integrated approach has broad implications in the field of transplant
diagnostics.
INTRODUCTION

Long-term kidney allograft function continues to improve only modestly, despite dramatic improvements
in acute rejection rates and short-term patient and graft survival.1 In spite of its limitations, measurement
of  serum  creatinine  remains  the  primary  monitoring  modality  following  kidney  transplantation. 

Significant  changes  in  serum  creatinine,  and/or  the  development  of  proteinuria,  result  in  a  series  of 
maneuvers  to  define  the  many  potential  etiologies  of  acute  and  chronic  allograft  dysfunction.  Allograft 
biopsy  is  the  “gold-standard”  of  these  maneuvers,  although  morphologic  analysis  may  not  easily 
distinguish these etiologies. Furthermore, the analysis may be limited with regard to prognostic importance
and functional outcome. Thus, identification of biomarkers of allograft failure and the development of
tools for their interpretation are of critical interest, both in providing disease detection in a more sensitive
and  specific  fashion,  and  in  allowing  sufficient  lead  time  for  intervention.  Additionally,  such  markers  may 
allow  for  risk  assessment  and  medical-regimen  tailoring  that  is  personalized  to  provide  optimum 
outcomes. 

Transplant  glomerulopathy  (TG)  is  a  disease  of  the  kidney  allograft  initiated  by  endothelial  injury. 
Morphologically,  there  is  widening  of  the  subendothelial  space  with  accumulation  of  debris,  mesangial 
interpositioning, and matrix deposition in the glomerular capillary wall, as well as capillary wall double-contouring
in the absence of immune complex deposition.2 Electron microscopy may show endothelial
cell separation from the glomerular basement membrane prior to light microscopic changes. The etiology
of TG is under considerable scrutiny. Prior studies implicated an antibody-mediated response,3-5 but this
has not been consistently demonstrated.6,7 Accompanying this lesion may be evidence of chronic injury,
including  interstitial  fibrosis  and  tubular  atrophy  (IF/TA),  the  hallmarks  of  chronic  allograft  nephropathy.8 
Clinical  presentation  often  occurs  a  year  or  more  after  transplantation,  although  in  the  context  of  protocol 
kidney  biopsies,  light  microscopic  changes  may  be  seen  earlier,  with  associated  proteinuria,  hypertension, 
and  a  progressive  decline  in  function  culminating  in  graft  loss.9  Importantly,  there  is  no  specific  effective 
therapeutic strategy beyond augmentation of immunosuppression. Thus, identifying pathogenic mediators
not only for therapeutic purposes but also for early identification may lead to improved outcomes.

In  this  study,  we  assess  the  potential  of  a  novel  diagnostic  method  utilizing  custom  low  density  gene 
expression  arrays  and  machine  learning  algorithms  in  an  effort  to  determine  the  transcriptional  features 
associated  with  TG  and  to  begin  to  identify  biomarkers  that  may  be  indicative  of  TG.  While  there  has 
been  some  research  in  identifying  biomarkers  of  TG,  we  have  yet  to  see  the  evaluation  of  a  systems 
biology  approach  to  this  problem.  We  focused  on  transcripts  that  have  been  associated  with  other  forms 
of  acute  and  chronic  renal  allograft  injury  in  kidney  allograft  recipients  with  the  intent  of  evaluating  a 
systems  biology  modeling  approach.  Initial  data  analysis  using  conventional  statistical  methods 
confirmed the proinflammatory state of this lesion.10 Incorporation of these data utilizing machine-learning
software, however, derived statistically significant yet substantially novel associations between
individual  transcripts.  We  performed  this  analysis  specifically  to  assess  the  potential  value  of  a  graphical, 
hierarchical  model  of  conditional  dependence  in  generating  novel  hypotheses  and  providing  guidance  in 
patient  classification.  Moreover,  the  resulting  model  provides  insight  into  the  probable  pathogenesis  of 
TG  and  a  set  of  potential  biomarkers  to  test  and  characterize  recipients  at  risk  for  disease.  These  results 
highlight  the  hypothesis-generating  potential  of  this  method  by  elucidating  potential  pathways  for 
investigation  and  the  decision-supportive  utility  of  defined,  quantitative  classification  models  of  disease 
versus  health  states. 

METHODS 

Patient  selection  and  evaluation 

Protocols  were  approved  by  the  Institutional  Review  Board  of  the  National  Institutes  of  Health  and 
included  informed  consent.  Retrospective  review  of  963  renal  transplant  core  biopsies  (166  patients) 
identified  TG  in  20  biopsies  (18  patients;  10.8%).  A  cohort  of  32  biopsies  (19  patients)  of  stable  function 
(SF) allografts was studied for comparison. SF was defined as at least 6 months post-transplant without
change  in  renal  function  and  the  absence  of  any  significant  histological  or  clinical  abnormalities. 

Immunosuppression  included  induction  in  94.6%  (n=35)  using  rabbit  anti-thymocyte  globulin  (40.5%; 
n=15), Alemtuzumab (29.7%; n=11), Daclizumab (18.9%; n=7), or solumedrol alone (5.4%; n=2).
Patients were maintained on monotherapy with tacrolimus or sirolimus (56.8%; n=21), triple
immunosuppressive therapy including tacrolimus or sirolimus, MMF, and prednisone (29.7%; n=11), or
other  maintenance  regimen  variations  (14.5%;  n=5). 

Patients were routinely screened post-transplant for anti-HLA antibodies with solid phase Class I and II
ELISA  or  multi-antigen  synthetic  flow  bead  (Tepnel,  Stamford,  CT)  testing  with  the  Luminex  system 
(Luminex  Inc.,  Austin,  TX).  Positive  sera  were  subsequently  tested  using  specific  HLA  antigen-coated 
flow  beads  (One  Lambda,  Inc.,  Canoga  Park,  CA). 

Biopsy  acquisition,  preparation,  and  evaluation 

Protocol biopsies were obtained routinely at time of transplantation and at 1, 6, 12, 36, and in some cases
60 months. A portion of the cortex was snap frozen as previously described.11 The percent cortex was
obtained for each sample for validation. For all biopsies this was 75.71 ± 25.52%. The remaining portions
were  fixed  in  formalin,  sectioned,  and  stained. 

All  biopsies  were  evaluated  in  masked  fashion  by  a  single  dedicated  pathologist  and  scored  using  the 
Banff classification.12-14 The diagnosis of TG was based on the presence of glomerular basement
membrane  duplication  and  severity  scored  on  the  percentage  of  glomerular  peripheral  capillary  loops 
involved  in  affected,  non-sclerotic  glomeruli.1 '  Patients  with  duplication  of  the  glomerular  basement 
membrane due to recurrent disease were excluded. C4d staining was performed in 43/52 samples using an
immunoperoxidase technique.

Quantitative real-time polymerase chain reaction (qPCR)

RNA was extracted and converted to cDNA.15 cDNA (100 ng) was used for qPCR using a Low Density
Array  (Applied  Biosystems  Inc.,  Foster  City,  CA).16  Two  groups  of  targets  (Table  1)  were  quantitated:  43 
transcripts  related  to  general  immune  function  (Gene  Panel  1  -  GP1)  and  45  transcripts  related  to  allograft 
fibrosis  pathways  (Gene  Panel  2  -  GP2).  Each  target  was  analyzed  in  quadruplicate.  18S  ribosomal  RNA 
was  used  as  an  internal  control.  Individual  samples  were  compared  with  pooled  cDNA  from  live  donors 
undergoing  open  donor  nephrectomy.  Transcript  quantification  was  derived  using  the  comparative 
threshold cycle method,16 and reported as n-fold difference.
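
As an illustration of the comparative threshold cycle calculation, the following Python sketch computes n-fold expression from Ct values. It assumes the standard 2^(-ΔΔCt) form with 100% amplification efficiency, the 18S rRNA Ct as the internal control, and pooled live-donor cDNA as the calibrator. The function name and all Ct values are hypothetical and are not taken from the study data.

```python
def fold_change(ct_target_sample, ct_control_sample,
                ct_target_calibrator, ct_control_calibrator):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Assumes roughly 100% amplification efficiency for target and control.
    """
    # Normalize the target to the internal control (here: 18S rRNA) in each specimen.
    delta_ct_sample = ct_target_sample - ct_control_sample
    delta_ct_calibrator = ct_target_calibrator - ct_control_calibrator
    # Compare the biopsy with the calibrator (here: pooled live-donor cDNA).
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the biopsy than in the calibrator, with identical 18S Ct values.
example = fold_change(24.0, 10.0, 26.0, 10.0)
print(example)  # 4.0, i.e. a 4-fold increase
```

In practice the Ct passed in for each target would be the mean of its quadruplicate wells.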

Statistical  Analysis 

SF  and  TG  patient  clinical  variables  were  compared  and  analyzed  using  SPSS  (SPSS  16.0,  SPSS  Inc., 
Chicago, IL). Associations between categorical variables were studied with Fisher exact test or χ² test, as
appropriate. Continuous variables of normally distributed data were assessed with the Student t-test.
Relative-fold expression data were compared using independent sample Student t-test. A two-tailed p
value  <  0.05  was  considered  statistically  significant. 

Multivariate  dependence  relationships  were  also  analyzed  with  FasterAnalytics™  modeling  software 
(Decision  Q,  Washington,  DC),  a  machine  learning  software  package  used  to  develop  graphical  models  of 
conditional  dependence  (Bayesian  networks).  Machine  learned  Bayesian  networks  identify  conditional 
dependence  between  variables  and  present  this  structure  to  the  user  in  an  intuitive,  graphical  format. 
Preliminary modeling included diagnosis (DX; TG or SF), C4d staining grade (Banff C4d; 0, 1, 2, or 3),
and  each  gene  of  either  GP1  or  GP2.  Quantitative  transcript  relative-expression  data  were  categorized  by 
distribution  into  three  equal  probability  density  groups.  Models  were  derived  in  step-wise  iterations  until 
the  optimal  network  was  identified  as  determined  by  cross-validation  and  qualitative  assessment  against 
clinical experience and the literature. To evaluate robustness of the Bayesian models, 10% of the biopsy
data were randomly excluded from each dataset in 10 non-overlapping iterations for a 10-fold cross-validation.17
To further validate the statistical significance of our findings, we used the directed graph
produced by the model and performed a Fisher’s exact test to quantify the statistical significance of each
arc on a bivariate basis between the nodes at each end of the arc. Each data series was divided into the same
reference ranges as in the Bayesian network, and for each pair a p-value was calculated using Fisher’s exact test.
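
The discretization and cross-validation steps described above can be sketched in Python as follows. This is a minimal illustration, not the FasterAnalytics procedure, whose internals are not described here. It assumes that "three equal probability density groups" means equal-count tertiles, and the function names, bin labels, and random seed are hypothetical.

```python
import random


def tertile_cutpoints(values):
    """Return two cut points splitting the values into three roughly equal-count groups."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def discretize(value, cutpoints):
    """Map a relative-expression value to a low / mid / high bin."""
    low, high = cutpoints
    if value < low:
        return "low"
    if value < high:
        return "mid"
    return "high"


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k non-overlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


# Hypothetical fold-expression values for one transcript.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
cuts = tertile_cutpoints(values)
bins = [discretize(v, cuts) for v in values]

# 52 biopsies (20 TG + 32 SF) dealt into 10 folds of 5 or 6 biopsies each.
folds = k_fold_indices(52, k=10)
```

Each fold is held out once while the network is learned on the remaining biopsies, so every biopsy is used for testing exactly once.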

RESULTS 

Patient  demographics  and  biopsy  characteristics 

Patient  demographics  were  matched  between  the  SF  and  TG  patients  (Table  2).  Patients  with  TG  had 
higher  mean  pre-transplant  peak  PRA  and  were  more  likely  to  have  had  a  prior  transplant,  but  these 
associations were not statistically significant. However, patients with TG had a significantly higher
incidence of post-transplant donor specific antibody (DSA) and graft loss (p<0.05), consistent with prior
reported experiences.5,8

Mean  creatinine  at  time  of  biopsy  was  significantly  higher  in  TG  (2.5  ±1.1  mg/dL)  compared  with  SF  (1.3 
±0.3 mg/dL) allografts (p<0.001). Significantly more biopsies from TG allografts were obtained for
clinical  cause  (p=0.001;  Table  3),  and,  not  surprisingly,  C4d  immunostaining  was  more  often  positive 
in  TG  biopsies  than  in  SF  (p=0.013).  By  definition,  the  TG  biopsies  had  a  higher  chronic 
glomerulopathy  (CG)  grade  (2.7  ±0.5)  than  SF  biopsies,  p<0.001. 

Allograft  transcriptional  profile 

Thirty-two  transcripts  related  to  immune  function  were  significantly  (p<0.05)  increased  in  TG  compared 
with  SF,  suggesting  upregulation  of  inflammatory  pathways  involving  multiple  immune  mechanisms 
(Figure 1A-G). Moreover, twenty-five transcripts related to interstitial fibrosis were increased in TG
compared with SF (p<0.05).

Graphical Bayesian network of conditional dependence in transcript expression

Machine-learned  Bayesian  network  modeling  was  used  to  identify  conditional  dependence  relationships 
between  gene  transcript  expression  data  and  allograft  pathology  (TG  versus  SF  as  defined  in  Methods). 
GP1  and  GP2  expression  data  were  modeled  independently  and  optimized  relative  to  diagnosis.  Below  we 
focus  on  those  genes  found  to  be  most  interdependent  on  diagnosis  within  the  generated  networks. 

In GP1 (Figure 2A), ICAM-1, IL-10, CCL3, and CD86 shared the strongest conditional dependence with
diagnosis in the machine-learned model. To further validate our findings, we used the associations
identified in the GP1 Bayesian network to perform Fisher exact testing and found the conditional
dependence associations to be highly statistically significant, with all associations having a p-value of
0.01 or less (Table 4). Similar to our findings above, multiple additional T-cell function, costimulatory,
chemotaxis,  and  cytokine  transcripts  were  also  related  to  allograft  pathology.  In  SF,  relative  transcript 
expression  is  lower  within  the  network  (Figure  2B);  conversely,  in  TG  the  relative  expression  of  these 
transcripts  is  generally  higher  than  in  the  controls  (Figure  2C).  In  this,  the  unsupervised,  machine-learned 
Bayesian  analysis  has  identified  intriguing  novel  and  potentially  important  relationships  among  the  gene 
expression  profiles  and  allograft  pathology  for  further  exploration  and  biological  validation. 
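
To illustrate the kind of arc-level check described above, the following Python sketch computes a two-sided Fisher exact p-value for a single 2x2 table. It is a simplification: the arcs in the network connect variables with three or more states, so the actual tables would be larger than 2x2. The function name and the counts are hypothetical and are not study data.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def weight(x):
        # Unnormalized probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / total


# Hypothetical counts: rows = diagnosis (TG, SF), columns = transcript (high, not high).
p_value = fisher_exact_2x2(3, 1, 1, 3)
```

Integer weights are compared before dividing by the total, which avoids floating-point ties when deciding which tables count as at least as extreme as the observed one.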

This  Bayesian  network  further  enabled  investigation  of  the  possible  influence  of  specific,  coincident  gene 
expression  on  allograft  pathology.  More  specifically,  classification  model  GP1  allowed  us  to  estimate  the 
probability  of  outcome  in  the  study  population  by  using  available  evidence  and  without  necessitating  a 
full  understanding  of  the  underlying  biological  pathways.  Setting  the  evidence  of  coincidentally  elevated 
expression  of  ICAM-1  (>1.84  fold),  IL-10  (>16.9  fold),  and  CCL3  (>3.15  fold)  increased  the  probability 
of TG to 99.67% as opposed to SF (Figure 3A). This served as a demonstration of allograft-outcome
probability calculation using transplant-specific gene expression data and an internally cross-validated
model.

With increased expression (>8.89 fold) of the costimulatory molecule CD86, multiple related transcripts
are  also  increased  as  illustrated  by  the  adjusted  probability  distributions  throughout  the  network  (Figure 
3B).  In  addition,  by  increasing  only  CD86  expression,  the  probability  of  TG  also  increased  to  80.61%. 
This  interactive,  evidence-based  feature  of  Bayesian  network  analysis  facilitates  elucidation  of  potentially 
novel  biological  pathways. 

Comparative  transcript  and  histological  Bayesian  network  probability  of  allograft  pathology 

In  the  Bayesian  model  of  GP2,  VCAM-1,  MMP-9,  MMP-7,  and  LAMC2  are  critically  related  to 
pathology  (Figure  4A).  To  further  validate  our  findings,  we  used  the  associations  identified  in  the  GP2 
Bayesian  network  to  perform  Fisher  exact  testing  and  found  the  conditional  dependence  associations  to  be 
highly statistically significant, with all associations having a p-value of 0.001 or less (Table 5). Coupling
histological  criteria  with  transcript  expression  in  the  context  of  a  cross-validated  classification  model 
provided  a  powerful  predictor  of  allograft  pathology.  The  probability  of  TG  with  a  C4d  grade  of  3  alone 
is  81.25%  (Figure  4B;  4D)  while,  not  unexpectedly,  the  probability  of  SF  with  a  C4d  grade  of  0  is  83.9% 
(Figure  4D).  However,  when  coupled  to  increased  expression  of  VCAM-1  (>1.96  fold),  MMP-9  (>5.34 
fold),  MMP-7  (>2.77  fold),  and  LAMC2  (>2.19  fold)  the  probability  of  TG  increased  to  99.67%  (Figure 
4C).  Furthermore,  with  increased  expression  of  only  LAMC2  (>2.19  fold)  and  MMP-7  (>2.77  fold),  the 
probability  of  TG  is  95.6%;  with  decreased  expression  of  LAMC2  (<0.52  fold)  and  MMP-7  (<1.04  fold), 
the  probability  of  SF  is  99.1%  (Figure  4E). 

Bayesian  model  validation  for  transcript  network  prediction  of  allograft  pathology 

Using  10-fold  cross-validation  analysis,  both  models  for  GP1  and  GP2  estimated  allograft  pathology.  GP1 
estimated TG with an AUC (95% CI) of 0.875 (0.675-0.999), p=0.004 and sensitivity, specificity, positive
predictive value, and negative predictive value of 85.7%, 87.5%, 92.3%, and 77.8%, respectively; GP2
estimated TG with an AUC (95% CI) of 0.859 (0.754-0.963), p<0.001 and sensitivity, specificity, positive
predictive  value,  and  negative  predictive  value  of  80.0%,  84.4%,  76.2%,  and  87.1%,  respectively. 
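
For readers less familiar with these measures, the Python sketch below shows how sensitivity, specificity, and predictive values follow from classification counts. The counts used are not reported in the text. They are one set of counts that is consistent with the GP2 percentages above, given 20 TG and 32 SF biopsies, and should be read as an assumption. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # TG biopsies correctly called TG
        "specificity": tn / (tn + fp),  # SF biopsies correctly called SF
        "ppv": tp / (tp + fp),          # predicted TG that are truly TG
        "npv": tn / (tn + fn),          # predicted SF that are truly SF
    }


# Assumed counts (not from the source): 16 of 20 TG and 27 of 32 SF classified correctly.
metrics = diagnostic_metrics(tp=16, fn=4, tn=27, fp=5)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
```

With these assumed counts the rounded percentages are 80.0, 84.4, 76.2, and 87.1, matching the values reported for GP2.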

Of interest is the models' ability to predict outcome in sequential biopsies. Thirteen cross-validation
models  were  generated  using  the  same  discretization  as  the  overall  Bayesian  network  (Table  6).  For  each 
training  dataset,  a  single  patient’s  multiple  biopsies  were  removed.  The  removed  patient  data  were  then 
used  to  test  the  new  validation  model.  This  cohort  of  patients  with  multiple  biopsies  included  28  biopsies 
from  13  patients,  of  which  4  biopsies  (2  patients)  were  diagnosed  with  TG  and  24  biopsies  were 
diagnosed  as  SF.  A  comparison  of  resulting  model  predictions  to  pathological  findings  yields  one  patient 
with  one  biopsy  discrepantly  identified,  while  the  other  biopsy  had  confirmatory  identification;  one 
patient  with  both  biopsies  discrepantly  identified;  two  patients  with  both  of  their  biopsies  confirmed  as 
TG;  and  nine  patients  with  each  of  their  biopsies  confirmed  as  SF. 
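
The patient-wise hold-out described above, in which all biopsies from one patient are removed from training and then used for testing, can be sketched in Python as follows. The function name and the patient identifiers are hypothetical, and the model-fitting step itself is omitted.

```python
from collections import defaultdict


def leave_one_patient_out(patient_ids):
    """Yield (patient, train_indices, test_indices), holding out one patient at a time."""
    by_patient = defaultdict(list)
    for index, patient in enumerate(patient_ids):
        by_patient[patient].append(index)
    for patient, test_indices in by_patient.items():
        held_out = set(test_indices)
        # Train on every biopsy that does not belong to the held-out patient.
        train_indices = [i for i in range(len(patient_ids)) if i not in held_out]
        yield patient, train_indices, test_indices


# Hypothetical identifiers: one entry per biopsy, repeated for sequential biopsies.
biopsy_patients = ["P1", "P1", "P2", "P2", "P2", "P3"]
splits = list(leave_one_patient_out(biopsy_patients))
```

Grouping by patient keeps sequential biopsies from the same graft out of the training data for that graft, so the test is not helped by within-patient similarity.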

The  analysis  of  sequential  renal  graft  biopsies  yielded  consistent  transcript  expression  profiles  within 
GP2.  Each  biopsy,  when  classified  by  the  same  Bayesian  network,  exhibited  probability  of  diagnosis 
similar to its subsequent, matched sample with the exception of two biopsies from one graft, patient 002-003.
An additional point is that two of the three SF biopsies that were discrepantly classified as TG by
the  model  were  from  the  same  graft,  patient  002-014,  and  were  in  agreement  with  each  other  with  a 
probability  greater  than  0.84. 

DISCUSSION 

In this study we attempted to characterize a panel of genes associated with TG using a novel machine-learning
methodology producing Bayesian networks to evaluate a systems biology approach in
identifying  novel  biomarkers  of  TG.  As  such,  we  have  demonstrated  that  multiple  gene  transcripts 
associated  with  immune  function  and  fibrosis  are  transcriptionally  active  in  TG,  have  statistically 
significant association with outcome, and may have the potential to be highly predictive of abnormal
outcome. We believe that with further development and prospective clinical validation, this integrated
approach  will  enable  further  understanding  of  this  disease  process  and  allow  for  the  development  of  a 
clinically  relevant  diagnostic  of  allograft  pathology  and  ultimately  provide  biomarkers  for  use  in  clinical 
trials. 

When  examined  using  traditional  statistics,  gene  transcripts  related  to  T-cell  activation,  effector  function, 
costimulation,  chemotaxis,  and  endothelial  activation  (Gene  Panel  1)  are  all  up-regulated  in  our  recipients 
with  TG  (Figure  1).  Furthermore,  transcripts  associated  with  epithelial-mesenchymal  transformation, 
cytoskeleton  structure,  and  growth  regulation  (Gene  Panel  2)  are  also  up-regulated.  While  it  is  evident 
that  the  transcriptional  profile  within  biopsies  histologically  identified  as  positive  for  TG  is  markedly 
different  from  a  stable  functioning  allograft,  these  differences  indicate  greater  general  immune  activation 
without  identifying  specific  pathways.  Such  findings  have  not  been  previously  reported  as  many  studies 
of TG have focused on histologic changes and association with alloantibody.18 As the comprehensive
interpretation  of  such  datasets  has  remained  difficult,  we  believe  that  our  novel  approach  provides  a 
method  for  the  generation  of  new  hypothetical  pathways. 

Methods  to  analyze  complex,  heterogeneous  data  sets  incorporating  the  relationships  between  clinical, 
histological,  and  transcriptional  variables  have  enormous  utility  in  clinical  research  and  application.  In 
order  to  address  the  inherent  analytical  complexity  of  biomarker  datasets,  we  have  utilized  machine- 
learned  Bayesian  network  analysis.  Bayesian  theory  relates  the  conditional  independence  of  known 
events in order to compute posterior, or unknown, probabilities.19,20 Here, we have applied measured
transcriptional  data  in  order  to  determine  the  posterior  probability  of  allograft  pathology.  While  there  are 
many  analytical  technologies  available,  we  have  focused  on  machine  learning  because  of  its  inherent 
ability  to  address  high-dimensionality  multivariate  data.  We  selected  Bayesian  networks,  specifically, 
because  the  graphical  models  produced  are  transparent  and  intuitive,  which  allows  the  researcher  to  more 
readily identify underlying mechanisms. This approach enhances many classical statistical analyses and
has been employed for diagnostic and prognostic applications in oncology21,22 and cardiology.23,24
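
The posterior calculation referred to above can be illustrated for the simplest case of two diagnoses and a single piece of evidence. This Python sketch applies Bayes' rule directly and is not the network inference performed by the modeling software. The prior uses the proportion of TG biopsies among those modeled (20 of 52); the two likelihoods are hypothetical values chosen only for illustration, as is the function name.

```python
def posterior_tg(prior_tg, p_evidence_given_tg, p_evidence_given_sf):
    """P(TG | evidence) by Bayes' rule for two mutually exclusive diagnoses."""
    prior_sf = 1.0 - prior_tg
    numerator = p_evidence_given_tg * prior_tg
    denominator = numerator + p_evidence_given_sf * prior_sf
    return numerator / denominator


# Prior from the modeled biopsies: 20 TG out of 52.
prior = 20 / 52
# Hypothetical likelihoods of observing a "high" transcript bin in TG versus SF.
posterior = posterior_tg(prior, p_evidence_given_tg=0.6, p_evidence_given_sf=0.1)
```

With these illustrative inputs the posterior probability of TG is about 0.79. Evidence that is equally likely under both diagnoses leaves the prior unchanged.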

When  analyzed  using  Bayesian  analysis  alone,  several  conditional  relationships  become  apparent  within 
the  transcript  profiles.  Specifically,  ICAM-1,  IL-10,  CCL3,  and  CD86  are  all  conditionally  related  to 
outcome  when  modeling  gene  expression  related  to  immune  function  (GP1).  Additionally,  VCAM-1, 
MMP-9,  MMP-7,  and  LAMC2  are  conditionally  related  to  allograft  pathology  when  analyzed  with 
respect to fibrosis (GP2). As TG is rarely an isolated pathologic lesion,8 the association of IL-10, CCL3,
and  CD86,  although  not  histologically  evident,  may  represent  transcriptional  overlay  of  cell-mediated 
inflammatory  changes  within  the  allograft.25  Endothelial  activation  has  been  proposed  as  a  mechanism  for 
the pathogenesis of TG,9 and interestingly, the machine-learned Bayesian network, GP1, associated two
endothelial  adhesion  molecules,  ICAM-1  and  VCAM-1,  directly  to  allograft  pathology.  TG  is  also 
morphologically  characterized  by  duplication  of  the  basement  membrane,  and  appropriately,  laminin 
(LAMC2),  a  major  component  of  basement  membrane,  was  also  identified  as  a  vital  transcriptional 
indicator. 

While  many  of  these  associations  uncovered  by  the  Bayesian  model  are  novel,  several  well-established 
relationships  support  that  such  an  approach  has  biologic  relevance.  The  relationship  between  pathology 
and  cell  signaling  (chemokine  expression),  cell  trafficking  (adhesion  molecule  expression)  and  tissue 
remodeling (MMP expression) as demonstrated by this analysis is supported by current models of TG.26-29
TG is believed to be secondary to binding of donor specific antibodies to endothelium with resulting
stimulation and recruiting of secondary mediators leading to an inflammatory response.9,30 This
inflammatory response and subsequent tissue injury has been associated with chemokine, adhesion
molecule and MMP expression. Additionally, adhesion molecule expression has been shown
to be associated with both chronic disease and stable function in renal transplant recipients.32 As
demonstrated in our model, alteration of chemokine expression has been linked to costimulatory
molecules (CD28, CD40L, CD80, CD86) and IL-10 has been demonstrated to be elevated in allografts with stable
function. Finally, the development of TG and C4d expression has also been well characterized.9,38
Thus, our preliminary data suggest that machine-learned Bayesian models may elucidate critical
pathways  related  to  the  allograft  pathology  within  transcriptional  datasets. 

While transcriptional analyses of renal transplant biopsies have been extensively reported,39,40 this is
the  first  report  of  an  integrated  panel  with  a  graphical  tool  to  define  conditional  relationships  and 
potentially  assist  the  clinician  in  prediction  of  transplant-specific  diagnosis.  As  such,  the  ability  to  analyze 
several  biomarkers  at  once  offers  several  advantages  over  individual  endpoints.  This  allows  for  the 
consideration of temporal changes, as individual biomarker levels vary with time during the disease
process.41  Further,  it  allows  for  the  capture  of  feedback  loops  and  inter-biomarker  dependencies  to 
improve  sensitivity  and  specificity.  Efforts  in  oncology  and  HIV  have  already  introduced  personalized 
medicine  into  clinical  reality.42  In  order  for  a  biomarker,  or  panel  of  biomarkers,  to  reach  clinical  utility  as 
part  of  a  personalized  medicine  approach,  it  would  ideally  meet  the  following  criteria:  inexpensive,  easy 
and  rapid  quantification,  detectable  early  in  the  course  of  the  disease  process,  repeatable,  and  have  a  high 
degree of sensitivity and specificity.43,44 Based on our preliminary data, we believe that our panel of
biomarkers, once prospectively clinically validated, may address these goals for the diagnosis of
transplant pathology. PCR-based assays, such as used herein, offer the potential of cost effectiveness,
repeatability, and rapidity.45 Additionally, given the relatively long-term survival of renal allografts,
prospective application of such models to transplant biopsies may allow for early diagnosis, therapy
alterations, and guidance in clinical trials.2,46

Admittedly,  this  study  is  limited  by  the  current  dataset  size  and  disease  process.  The  dataset  employed 
included  patients  with  several  different  immunosuppressant  approaches.  While  this  reduces  uniformity, 
we  feel  that  it  enhanced  model  robustness  and  favored  broad  utilization  as  the  strategies  used  represent 
current  standard  practices.  TG  pathology  is  evolving,  so  there  are  limited  treatment  options.  This  limits 
the ability to judge the effectiveness of a biomarker panel in addressing a clinical disease process;
however,  the  ability  to  predict  SF  as  described  does  offer  the  ability  to  affect  therapeutic  decisions. 
Patients  quantitatively  classified  by  the  network  as  SF,  as  opposed  to  ongoing  pathology,  may  benefit 
from  immunosuppression  weaning  rather  than  continuous  therapy,  which  has  inherent  toxicity,  infection, 
and  malignancy  risks.  Most  importantly,  the  retrospective  nature  of  this  study  does  not  allow  for  robust 
validation  of  the  predictive  models  developed  from  the  relationships  derived  from  the  transcriptional 
datasets.  This  will  require  additional  prospective  analysis  with  models  such  as  described.  Additionally, 
the  small  number  of  cases  available  for  analysis  renders  the  disease  specific  conclusions  not  definitive 
and  requires  further  analysis.  This  is  due  to  a  lack  of  processed  biopsies  prior  to  the  identification  of  TG 
via  histologic  findings.  Therefore,  given  that  many  of  the  relationships  uncovered  by  the  model  appear  to 
be  supported  by  other  investigators,  we  believe  that  future  analysis  would  be  able  to  validate  the  current 
model and develop similar models for disease states where known therapeutic options exist, such as acute
or antibody-mediated rejection.

In  renal  transplant  recipients,  we  have  demonstrated  the  ability  to  apply  machine-learned  Bayesian 
analysis  of  allograft  biopsy  gene  expression  to  establish  models  of  interdependent  relationships  and 
pathologic  probabilities.  The  machine  learning  approach,  which  highlighted  relationships  not  readily 
apparent with other analytical methods, provides an overview for biomarker analysis of stable function
allograft  as  opposed  to  one  exhibiting  transplant  glomerulopathy.  We  plan  to  continue  this  work  in  an 
effort to validate our biomarker panel for use as surrogate endpoints of clinical trials and clinical decision-making.
As such, this effort signifies the introduction of personalized medicine in the realm of organ
transplantation. 
FIGURES


Figure 1. Transcriptional profile comparing stable function and transplant glomerulopathy allografts
(GP1 and GP2). Transcript expression levels that were statistically different between SF (open bars) and
TG (closed bars) allografts are shown (p<0.05). TG allografts showed significantly greater expression of
transcripts related to A) T-cell activation and effector function, B) costimulatory molecules, C)
chemotaxis, D) inflammatory cytokines and endothelial activation, E) epithelial-mesenchymal
transformation, F) cytoskeleton structure, and G) growth factors and regulators of tissue remodeling when
compared with SF allografts. Results are mean n-fold expression relative to normal, non-transplanted
kidneys and are depicted on a logarithmic scale. Error bars represent standard error of the mean (SEM).

Figure 2. Bayesian transcript network (GP1) and relationship to allograft pathology. A) The Bayesian
transcript network structure of GP1 as established by the iterative modeling methods. The relative (n-fold)
expression is represented for selected transcripts in three equal-area bins with associated probability
distributions (blue bars) as predicted by the Bayesian model. In this model, ICAM-1, IL-10, CCL3, and
CD86 were critically related to the allograft pathology variable ‘Dx’ (dashed box), as indicated by their
adjacent location in the network. Multiple additional cytokine, chemokine, and costimulatory transcripts
were also related to allograft pathology, but less closely, as indicated by their distance from the Dx
variable. Transcripts not related to allograft pathology are outside of the network (BCL2, BAX, SKI, CSF1).
B) With an SF allograft (set evidence is indicated by a black bar), the expression of multiple transcripts is
decreased within the network, where decreased expression is represented in green and increased
expression in red. The degree of shading represents the strength of the transcript relationship to the
outcome, where darker indicates a stronger and lighter a weaker relationship. C) In an allograft with TG
(black bar), the expression of the gene transcripts is increased in the Bayesian network.

Figure 3. Bayesian network probability analysis of allograft pathology based on transcript expression. A)
With increased expression of ICAM-1 (>1.84 fold), IL-10 (>16.9 fold), and CCL3 (>3.15 fold) (black
bars), the probability of a TG allograft (dashed box) increases to 99.67%. B) With increased expression
(>8.89 fold) of the costimulatory molecule CD86 (black bar), multiple related transcripts are also
increased. For example, with increased expression of CD86, the probability of increased expression (>28.5
fold) of CD40L is 81.12%. In addition, the probability of a TG allograft (dashed box) also increases to
80.61%. Black bars indicate set evidence, while blue bars indicate probability distributions within each
graph.
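
The effect of "setting evidence" described in Figure 3 can be illustrated with a minimal sketch. It assumes a simplified structure in which ICAM-1, IL-10, and CCL3 are conditionally independent given Dx (a naive Bayes simplification, not the network structure learned in this study), and every probability below is a hypothetical placeholder rather than a fitted value. The names `PRIOR`, `P_HIGH_GIVEN_DX`, and `posterior` are introduced here for illustration only.

```python
from math import prod

# Hypothetical prior over the diagnosis node Dx (NOT taken from the study).
PRIOR = {"SF": 0.6, "TG": 0.4}

# Hypothetical P(transcript falls in its highest expression bin | Dx).
# Illustrative placeholders only; these are not the fitted model parameters.
P_HIGH_GIVEN_DX = {
    "ICAM1": {"SF": 0.10, "TG": 0.70},
    "IL10": {"SF": 0.05, "TG": 0.60},
    "CCL3": {"SF": 0.10, "TG": 0.65},
}


def posterior(evidence):
    """Return P(Dx | evidence); evidence maps transcript -> True if in the high bin."""
    unnormalized = {}
    for dx, prior in PRIOR.items():
        # Multiply the prior by the likelihood of each observed transcript state.
        likelihood = prod(
            P_HIGH_GIVEN_DX[gene][dx] if is_high else 1.0 - P_HIGH_GIVEN_DX[gene][dx]
            for gene, is_high in evidence.items()
        )
        unnormalized[dx] = prior * likelihood
    total = sum(unnormalized.values())
    return {dx: value / total for dx, value in unnormalized.items()}


# With no evidence the posterior equals the prior; with all three transcripts
# set to "high", the probability mass shifts strongly toward TG.
no_evidence = posterior({})
all_high = posterior({"ICAM1": True, "IL10": True, "CCL3": True})
```

Under these assumed values, setting all three transcripts to their high bin moves most of the probability to TG, which is the qualitative behavior shown in panel A. The exact figure of 99.67% depends on the learned network and cannot be reproduced from this sketch.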

Figure 4. Bayesian transcript network (GP2) and relationship to allograft pathology. A) In this model,
VCAM-1, MMP-9, MMP-7, and LAMC2 are critically related to the allograft pathology (dashed box). In
addition, the Banff C4d grade was included in this dataset and is also critically related to allograft
pathology. The solid box indicates the portion of the network shown in panels B and C. B) With a Banff C4d
grade of 3 (black bar), the probability of a TG allograft (dashed box) is 81.25%. C) However, with
increased expression of VCAM-1 (>1.96 fold), MMP-9 (>5.34 fold), MMP-7 (>2.77 fold), and LAMC2
(>2.19 fold) (black bars), the probability of a TG allograft (dashed box) increases to 99.67%. D) Bayesian
prediction of allograft pathology based on C4d deposition. E) Bayesian prediction of allograft pathology
based on combinations of LAMC2 and MMP-7 expression levels. In panels D and E, “Probability of
Case” reflects the occurrence rate of the specified combination within the dataset. Low (green) to high
(red) values are color coded.
TABLES


Table 1: Gene transcript targets

Gene Panel 1              Gene Panel 2
BAX       EDN1            ACTA2     MMP9
BCL2      FASLG           ACTN4     NPHS1
C3        FOXP3           AFAP      NPHS2
CCL2      GATA3           AGRN      PDGFB
CCL3      GNLY            ANGPT2    S100A4
CCL5      GREM1           BMP7      SERPINE1
CCR1      GZMB            CD2AP     SERPINH1
CCR5      HLA-B1/3        CDH1      SMAD3
CD28      ICAM1           CDH3      SMAD7
CD3E      ICOS            COL1A1    SPARC
CD4       IFNG            COL3A1    SPP1
CD40      IL10            COL4A2    TGFB1
CD40LG    IL6             CTGF      THBS1
CD80      IL8             CTNNB1    TIMP1
CD86      MS4A1           FAT       TIMP2
CSF1      PDCD1           FGF2      TIMP3
CTLA4     PRF1            FN1       TIMP4
CX3CL1    SKI             GREM1     TJP1
CX3CR1    TBX21           HSPG2     TNC
CXCL10    TNF             IGF1      VCAM1
CXCL11                    LAMC2     VEGF
CXCL9                     MMP2      VIM
CXCR3                     MMP7

Table 2: Patient characteristics

Variable                            SF             TG             p-value
Number of patients                  19             18
Recipient age (years)               38 ± 14        42 ± 14        NS¹
Pre-transplant PRA peak             1.1 ± 3.0      5.2 ± 11.8     NS¹
Donor age (years)                   34 ± 10        41 ± 13        NS¹
Donor type (%)                                                    NS²
  Living related                    5 (26)         9 (50)
  Living unrelated                  9 (48)         4 (22)
  Deceased                          5 (26)         5 (28)
HLA mismatch                        3.6 ± 1.7      3.4 ± 1.5      NS¹
Post-transplant DSA (%)             1 (5)          10 (56)        0.001³
First transplant (%)                18 (95)        15 (83)        NS³
Death-censored graft loss (%)       0              4 (22)         0.021³
Death with functioning graft (%)    0              3 (8)          NS³
Follow-up (months)                  65.3 ± 23.0    70.3 ± 19.1    NS¹

¹ Student t-test; ² Chi-square; ³ Fisher exact test; NS, not significant

Table 3: Biopsy characteristics

Variable                            SF             TG             p-value
Number of biopsies                  32             20
Time from transplant (months)       23.31 ± 19.3   32.7 ± 16.3    NS¹
Creatinine (at time of biopsy)      1.3 ± 0.3      2.5 ± 1.1      <0.001¹
Biopsy for cause (%)                2 (6)          9 (45)         0.001²
  Minimal changes                   2              0
  Borderline rejection              0              2
  Acute humoral rejection           0              1
  Clinical toxicity                 0              1
  Recurrent MPGN                    0              2
  IF/TA                             0              3
C4d grade                           0.6 ± 0.9      1.6 ± 1.1      0.013¹
CG grade                            0              2.7 ± 0.5      <0.001¹
IF/TA grade                         0              1.4 ± 1.2      <0.001¹

¹ Student t-test; ² Chi-square; NS, not significant

Table 4: Fisher exact test of Bayesian associations in GP1

Association Pairs         p-value
Dx        ICAM1           <0.001
Dx        IL10            <0.001
Dx        CCL3            <0.001
CCL3      CD86            <0.001
CD86      CCL2            0.010
CD86      CXCL11          <0.001
CD86      CD40LG          <0.001
CD40LG    IFNG            0.004
CD40LG    CD28            <0.001
CXCL11    CD80            <0.001
CD80      CXCL10          0.001
CD80      GNLY            <0.001
GNLY      PRF1            0.001
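
For readers unfamiliar with the test used in Tables 4 and 5, the following minimal sketch computes a two-sided Fisher exact p-value for a single 2x2 contingency table. It assumes a simplification in which each variable is dichotomized (for example, TG versus SF against "high" versus "not high" expression), whereas the models in this study used three expression bins. The counts are hypothetical and the function name `fisher_exact_two_sided` is introduced here for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, col2, n = a + b, a + c, b + d, a + b + c + d

    def table_prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(col2, row1 - x) / comb(n, row1)

    observed = table_prob(a)
    lowest, highest = max(0, row1 - col2), min(row1, col1)
    probabilities = [table_prob(x) for x in range(lowest, highest + 1)]
    # Sum every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in probabilities if p <= observed * (1 + 1e-9))


# Hypothetical counts (not study data):
# rows = diagnosis (TG, SF), columns = transcript bin (high, not high).
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

The two-sided convention used here sums the probabilities of all tables that are as likely as, or less likely than, the observed table. Other conventions for two-sided p-values exist, and the source does not state which one was applied.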
Table 5: Fisher exact test of Bayesian associations in GP2

Association Pairs         p-value
Dx        VCAM1           <0.001
Dx        MMP9            <0.001
Dx        Banff C4d       0.001
Dx        MMP7            <0.001
Dx        LAMC2           <0.001

Table 6: Sequential Biopsy Validation by Patient

Patient     Days                Probability    Probability
Number      Post Tx     Dx      of SF          of TG
002-003     1158        SF      0.038          0.962
            1830        SF      0.977          0.023
002-014     170         SF      0.153          0.847
            338         SF      0.024          0.976
002-017     177         SF      0.985          0.015
            1094        SF      0.985          0.015
            2088        SF      0.999          0.001
002-018     366         SF      0.958          0.042
            1099        SF      0.997          0.003
002-021     1092        TG      0.119          0.881
            1281        TG      0.002          0.998
002-023     1146        SF      0.646          0.354
            1840        SF      0.961          0.039
002-024     381         SF      0.971          0.029
            1099        SF      1.000          0.000
            1721        SF      0.791          0.209
002-031     194         SF      0.998          0.002
            368         SF      0.726          0.274
002-033     1287        TG      0.013          0.987
            1078        TG      0.003          0.997
002-035     198         SF      0.999          0.001
            363         SF      0.969          0.031
002-044     186         SF      0.810          0.190
            371         SF      0.999          0.001
002-047     177         SF      0.991          0.009
            359         SF      0.997          0.003
002-048     168         SF      0.998          0.002
            421         SF      0.980          0.020

Tx, transplant; Dx, diagnosis as defined in Methods; SF, stable function; TG, transplant
glomerulopathy; probability greater than 0.500 was considered a positive test.
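
The 0.500 decision threshold stated in the footnote can be applied directly to the tabulated probabilities. The sketch below does so, assuming that a "positive test" means a model probability of TG greater than 0.500 and that the histologic Dx column is the reference standard. The biopsy values are transcribed from Table 6; the tally itself is an illustration and is not a result reported by the authors. The names `BIOPSIES` and `confusion_counts` are introduced here for illustration.

```python
# (patient, days post-transplant, histologic Dx, model P(TG)) from Table 6.
BIOPSIES = [
    ("002-003", 1158, "SF", 0.962), ("002-003", 1830, "SF", 0.023),
    ("002-014", 170, "SF", 0.847), ("002-014", 338, "SF", 0.976),
    ("002-017", 177, "SF", 0.015), ("002-017", 1094, "SF", 0.015),
    ("002-017", 2088, "SF", 0.001),
    ("002-018", 366, "SF", 0.042), ("002-018", 1099, "SF", 0.003),
    ("002-021", 1092, "TG", 0.881), ("002-021", 1281, "TG", 0.998),
    ("002-023", 1146, "SF", 0.354), ("002-023", 1840, "SF", 0.039),
    ("002-024", 381, "SF", 0.029), ("002-024", 1099, "SF", 0.000),
    ("002-024", 1721, "SF", 0.209),
    ("002-031", 194, "SF", 0.002), ("002-031", 368, "SF", 0.274),
    ("002-033", 1287, "TG", 0.987), ("002-033", 1078, "TG", 0.997),
    ("002-035", 198, "SF", 0.001), ("002-035", 363, "SF", 0.031),
    ("002-044", 186, "SF", 0.190), ("002-044", 371, "SF", 0.001),
    ("002-047", 177, "SF", 0.009), ("002-047", 359, "SF", 0.003),
    ("002-048", 168, "SF", 0.002), ("002-048", 421, "SF", 0.020),
]

THRESHOLD = 0.500  # P(TG) above this value is treated as a positive test


def confusion_counts(rows, threshold=THRESHOLD):
    """Tally model calls against the histologic diagnosis, one count per biopsy."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for _patient, _days, dx, p_tg in rows:
        predicted_tg = p_tg > threshold
        if dx == "TG":
            counts["TP" if predicted_tg else "FN"] += 1
        else:
            counts["FP" if predicted_tg else "TN"] += 1
    return counts


counts = confusion_counts(BIOPSIES)
sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])
specificity = counts["TN"] / (counts["TN"] + counts["FP"])
```

Because several biopsies come from the same patient, these per-biopsy counts are not independent observations and should be read as a descriptive summary only.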